A Polynomial Time Approximation Scheme for k-Consensus Clustering

Authors

  • Tom Coleman
  • Anthony Wirth
Abstract

This paper introduces a polynomial time approximation scheme (PTAS) for the metric Correlation Clustering problem when the number of clusters returned is bounded by k. Consensus Clustering is a fundamental aggregation problem with considerable application, and it is analysed here as a metric variant of the Correlation Clustering problem. The PTAS exploits a connection between Correlation Clustering and the k-cut problems. This requires the introduction of a new rebalancing technique, based on minimum cost perfect matchings, to provide clusters of the required sizes. Both Consensus Clustering and Correlation Clustering have been the focus of considerable recent study. There is an existing dichotomy between the k-restricted Correlation Clustering problems and the unrestricted versions: the former, in general, admit a PTAS, whereas the latter are, in general, APX-hard. This paper extends the dichotomy to the metric case, responding to the result that Consensus Clustering is APX-hard.

∗ The authors acknowledge the support of the Australian Research Council and the Stella Mary Langford scholarship fund.

† The University of Melbourne

‡ The University of Melbourne
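As a rough illustration of why Consensus Clustering can be read as a metric instance of Correlation Clustering, the sketch below weights each pair of items by the fraction of input partitions that separate it (these weights satisfy the triangle inequality) and then scores a candidate clustering by its disagreements. This is not the paper's PTAS: the exhaustive search over labelings is only a stand-in that is feasible for tiny inputs, and the function names (`disagreement_fractions`, `consensus_cost`, `best_k_consensus`) and the label-list encoding of partitions are assumptions made for this example.

```python
from itertools import combinations, product


def disagreement_fractions(partitions, n):
    """Map each pair (u, v) to the fraction of input partitions separating u and v.

    Each partition is a list of cluster labels, one label per item 0..n-1.
    """
    m = len(partitions)
    return {
        (u, v): sum(p[u] != p[v] for p in partitions) / m
        for u, v in combinations(range(n), 2)
    }


def consensus_cost(labels, w):
    """Total disagreement between a candidate clustering and the inputs.

    A pair placed together pays the fraction of inputs that separate it;
    a pair placed apart pays the fraction of inputs that keep it together.
    """
    cost = 0.0
    for (u, v), frac in w.items():
        if labels[u] == labels[v]:
            cost += frac
        else:
            cost += 1.0 - frac
    return cost


def best_k_consensus(partitions, n, k):
    """Brute-force the cheapest clustering with at most k clusters (tiny n only)."""
    w = disagreement_fractions(partitions, n)
    best = min(
        product(range(k), repeat=n),
        key=lambda labels: consensus_cost(labels, w),
    )
    return list(best), consensus_cost(best, w)


if __name__ == "__main__":
    # Three hypothetical input partitions of four items.
    inputs = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
    print(best_k_consensus(inputs, n=4, k=2))
```

The search visits k^n labelings, so it serves only to make the cost function concrete; the approximation scheme described in the abstract is what makes the bounded-k problem tractable at scale.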


Similar resources

A PTAS for the Minimum Consensus Clustering Problem with a Fixed Number of Clusters

The Consensus Clustering problem has been introduced as an effective way to analyze the results of different microarray experiments [5, 6]. The problem consists of looking for a partition that best summarizes a set of input partitions (each corresponding to a different microarray experiment) under a simple and intuitive cost function. The problem admits polynomial time algorithms on two input p...


From Practice to Theory; Approximation Schemes for Clustering and Network Design

What are the performance guarantees of the algorithms used in practice for clustering and network design problems? We answer this question by showing that the standard local search algorithm returns a nearly-optimal solution for low-dimensional Euclidean instances of the traveling salesman problem, Steiner tree, k-median and k-means. The result also extends to the case of graphs exclud...


From Practice to Theory, Approximation Schemes for Clustering and Network Design

What are the performance guarantees of the algorithms used in practice for clustering and network design problems? We answer this question by showing that the standard local search algorithm returns a nearly-optimal solution for low-dimensional Euclidean instances of the traveling salesman problem, Steiner tree, k-median and k-means. The result also extends to the case of graphs excluding a fix...


On the Approximation of Correlation Clustering and Consensus Clustering

The Correlation Clustering problem has been introduced recently [N. Bansal, A. Blum, S. Chawla, Correlation Clustering, in: Proc. 43rd Symp. Foundations of Computer Science, FOCS, 2002, pp. 238–247] as a model for clustering data when a binary relationship between data points is known. More precisely, for each pair of points we have two scores measuring the similarity and dissimilarity respecti...


A PTAS For The k-Consensus Structures Problem Under Squared Euclidean Distance

In this paper we consider a basic clustering problem that has uses in bioinformatics. A structural fragment is a sequence of ℓ points in a 3D space, where ℓ is a fixed natural number. Two structural fragments f1 and f2 are equivalent if and only if f1 = f2 · R + τ under some rotation R and translation τ. We consider the distance between two structural fragments to be the sum of the squared Euc...




Publication date: 2010